Unbiased degree-preserving randomisation of directed binary networks 
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Randomising networks using a naive 'accept-all' edge-swap algorithm is generally biased. Building on re- 
cent results for nondirected graphs, we construct an ergodic detailed balance Markov chain with non-trivial 
acceptance probabilities for directed graphs, which converges to a strictly uniform measure and is based on 
edge swaps that conserve all in- and out-degrees. The acceptance probabilities can also be generalized to define 
Markov chains that target any alternative desired measure on the space of directed graphs, in order to generate 
graphs with more sophisticated topological features. This is demonstrated by defining a process tailored to the 
production of directed graphs with specified degree-degree correlation functions. The theory is implemented 
numerically and tested on synthetic and biological network examples. 



I. INTRODUCTION 

When seeking to assess the statistical relevance of observa- 
tions made in real networks, there are three routes available. 
One could generate null-model networks for hypothesis test- 
ing from scratch, constrained by the values of observed pa- 
rameters in the real network (e.g. using the Molloy-Reed stub 
joining method [ 1 1, or the Barabasi-Albert preferential attach- 
ment model [2 1). Alternatively, one could generate null-model 
networks by randomising the original network, using dynami- 
cal rules that leave the values of relevant parameters invariant 
0. The final option is to use analytical methods to find en- 
semble averages for the observable of interest, see e.g. (4]|5). 

The null-model approach is appealing in its conceptual sim- 
plicity. It effectively provides synthetic 'data', which can 
be analysed in the same way as the real dataset. One can 
then learn which observed properties are particular to the real 
dataset, and which are common within the ensemble. 

Applications of network null-models are wide ranging and 
central to network science. [6| applies null models to iden- 
tify over-represented 'motifs' in the transcriptional regulation 
network of E. coli. (7 1 discusses adapting the Watts-Strogartz 
method to generating random networks to model power grids. 
[8 1 explores motifs found within an interfirm network. J5) 
uses network null-models to study social networks. [ 1 1 com- 
pares topological properties of interaction and transcription 
regulatory networks in yeast with randomised 'null model' 
networks and postulated that links between highly connected 
proteins are suppressed in protein interaction networks. IfTTI 
discusses the challenges of specifying a suitable matrix null 
model in the field of ecology. 

It is crucial that the synthetic networks generated as null 
models are representative of the underlying ensembles. Any 
inadvertent bias in the network generation process may inval- 
idate the hypothesis test. It is therefore worrying that the two 
most popular methods to randomise or generate null networks 
are in fact known to be biased. The common implementation 
of the stub-joining method, where invalid edges are rejected 
but the process subsequently continues (as opposed to start- 
ing from the beginning of the whole process), is known to be 
biased lfT2UT4l : in fact, even if upon invalid edge rejection 



the stub-joining process is restarted, it is not clear whether the 
graphs produced would be unbiased (we are not aware of any 
published proof). Similarly, the conventional 'accept-all' edge 
swap process, see e.g. fTBl . is also known to be biased fl6l : 
graphs on which many swaps can be executed are generated 
more often. The effects of these biases may in the past not al- 
ways have been serious IfTTI . but using biased algorithms for 
producing null models is fundamentally unsound, and unac- 
ceptable when there are rigorous unbiased alternatives lfl6l . 

In this paper we build on the work of [ 16 1 and |3| and de- 
fine a Markov Chain Monte Carlo process, based on ergodic 
in- and out- degree preserving edge-swap moves that act on 
directed networks. We first calculate correct move acceptance 
probabilities for the process to sample the space of all allowed 
directed graphs uniformly. We then extend the theory in order 
for the process to evolve to any desired measure on the space 
of directed graphs. Attention is paid to adapting our results 
for efficient numerical implementation. We also identify un- 
der which circumstances the error made by doing 'accept all' 
edge swaps is immaterial. We apply our theory to real and 
synthetic networks. 



II. AN ERGODIC AND UNBIASED RANDOMISATION 
PROCESS WHICH PRESERVES IN- AND OUT-DEGREES 

A. Edge swap moves 

The canonical moves for degree-preserving randomisation of 
graphs are the so-called 'edge swaps', see e.g. Ifl6l [T8l [19l . 
The undirected version of the edge swap is illustrated in fig- 
ure [TJ a generalisation to directed graphs is found in J3). The 
authors of |3] define a move - which we will refer to as a 
square swap - starting from a set of four entries from the 
connectivity matrix c e {0, \ } N ~ of a directed binary A^-node 
graph, defined by node pairs {{h, ]2),{h, ji),{h, 

such that the corresponding entries fc;, , c,-, j 2 , c^j 2 , c,-, j, } are 
alternately and 1, and not 'structural' (i.e. they are allowed 
to vary). As for the undirected case, the elementary edge 
swap move is defined by swapping the and 1 entries, i.e. 

{ c ii7i' c 'ij2' c i2j2' c ':,;'i) — * {l _c 'i>i' 1 -C iij2> ^~ c k,h> l -c 'Wi)' 
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FIG. 1. The undirected edge swap. This is the canonical choice for 
the elementary moves of an ergodic degree-preserving randomisation 
process on undirected networks. 
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how move acceptance probabilities should be defined to guar- 
antee stochastic evolution by edge swapping to any desired 
measure on the space of nondirected graphs. The analysis in 
lfl6ll is quite general, and briefly reproduced in section II B be- 



low. Here we will adapt their calculations to directed graphs 
and include the new moves defined by [3). This will result in 
a Markovian process based on edge swapping that will equili- 
brate to any desired measure on the space of directed graphs. 



B. Outline of the general theory 

This section briefly summarizes results of [ 16 1 which will be 
used in the next section. We define an adjacency matrix c = 
{cij}, where c (/ = 1 if and only if there is a directed link from 
node j to node i. We denote the set of all such graphs as 
C. The aim is to define and study constrained Markov chains 
for the evolution of c in some subspace Q e C. This is a 
discrete time stochastic process, where the probability p t (c) 
of observing a graph c at time t evolves according to 
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FIG. 2. The square swap (top) and triangle swap (bottom). In com- 
bination these two represent the canonical choice for the elementary 
moves of an ergodic degree-preserving randomisation process on di- 
rected networks without self-interactions. 



VceO : p t+1 (c) = ^ W(c\c')p t (c') (1) 
e'en 

where t e IN and W(c\c') is a transition probability. We re- 
quire the process to have three additional properties: 

1 . Each elementary move F can only act on a subset of all 
possible graphs. 

2. The process converges to the invariant measure 



The authors of Q prove that, if self interactions are permit- 
ted, repeated application of such moves can transform any bi- 
nary matrix Ca to any other binary matrix Cb with the same 
in- and out- degree distributions. However, if we require in 
addition that the diagonal elements of all c are (i.e. we for- 
bid self-interactions), then the edge swap defined above is no 
longer sufficient to ensure ergodicity. To remedy this prob- 
lem the authors of introduce a further move, which we 
will call a triangle swap. This move also gives us the sim- 
plest demonstration of two valid configurations that cannot be 
connected by square-type swaps. The square swap and the 
triangle swap are illustrated in figure [2j in combination these 
two moves allow us to transform between any two directed bi- 
nary matrices which have the same in- and out-degrees, even 
if self-interactions are forbidden |f3] . 

A stochastic process defined as accepting all randomly se- 
lected moves from the above set is ergodic but biased. This 
was already observed in 0, where the authors proposed a 
'Switch & Hold' algorithm, which involves the number of 
states accessible in one move from a configuration (its mobil- 
ity), and the maximum possible number of states accessible in 
one move from any network in the ensemble (the degrees of 
a hyper-graph, in the language of later publications). In ifTBI 
the problem was studied for undirected graphs; it was shown 



3. Each move F has a unique inverse, which acts on the 
same subset of states as F itself. 

We exclude the identity move from the set <t> of all moves, 
and we define an indicator function If(c) where If(c) = 1 iff 
the move c — > Fc is allowed. The transition probabilities are 
constructed to obey detailed balance 

Vcc'eQ : W(c\c')p x (c') = W{c'\c) Poo {c) (2) 

At each step a candidate move F e <E> is drawn with probabil- 
ity q{F\c'), where c' is the current state. The move is accepted 
with some probability A(Fc'\c'). In combination this leads to 

W(c\c') = J] [8c,fc>A{Fc'\c') + 8 c ,c [l-A(Fc'\c')]] 

(3) 

Insertion into Q then leads to the following conditions which 
must be satisfied by A(F c'\c') and q(F\c'): 

(Vc e Q)(VF e <D) : (4) 
q{F\c)A{Fc\c)e- H(C) = q{F- l \Fc)A{c\Fc)e- H{FC) 

We define the mobility n(c) to be the number of moves which 
can act on each state: n(c) = 'Efc® If(c)- If the candidate 
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moves are drawn randomly with equal probabilities from the 
set of all moves allowed to act, we find Q reducing to 



A(c\c') 



n{c , )e -\mc)-mc)} 



n{c')e 2 



i[#(C)-H(C')] 



+ «(c)ei [H(c) - ff(c ' )] 



(5) 



If we make the simplest choice H(c) = const, the above pro- 
cess will asymptotically sample all graphs with the imposed 
degree sequence uniformly. To sample this constrained space 
of graphs with alternative nontrivial probabilities p^ic) we 
would choose H(c) = - log p x {c) + const. 

Equation [4] also shows what would happen if we were to 
sample with A(c\c') = 1 for all (c,c'), i.e. for 'accept all' 
edge swapping: the detailed balance condition would give 

(Vc e Q)(V J F e <D) : e" H(C) «(c) = e - H(FC) n(Fc) (6) 

For this to be satisfied we require both sides of the expression 
to evaluate to a constant. Hence e~ H( - c ^ oc n(c), so the naive 
process will converge to the non-uniform measure 



Poc(c) = Z n(c) 



(7) 



This is the undesirable bias of 'accept-all' edge-swapping. It 
has a clear intuitive explanation. The mobility n(c) is the num- 
ber of allowed moves on network c, which is equal to the num- 
ber of inverse moves through which c can be reached in one 
step from another member of the ensemble. The likelihood of 
seeing a network c upon equilibration of the process is propor- 
tional to the number of entry points that c offers the process. 



C. Calculation of the mobilities for directed networks 

Since the two types of moves required for ergodic evolution of 
directed graphs, viz. the square swap and the triangle swap, 
are clearly distinct, the mobility of a graph c is given by n(c) = 
n n (c) + n A (c), where n a (c) and n A (c) count the number of 
possible moves of each type that can be executed on c. 

To find n n (c) we need to calculate how many distinct link- 
alternating cycles of length 4 can be chosen in graph c. We 
exclude self-interactions, so our cycles must involve 4 distinct 
nodes. The total number of such moves can be written as 



"□(c) = - 6 jk5u8 ik 6 jjCjjCktCkjCtf 

ijkt 



(8) 



where the pre-factor compensates for the symmetry, and 
where we used the short-hands cy = 1 - c k j and 8jk = 1 — 5j k - 
Expanding these shorthands in dSl gives after some further 
bookkeeping of terms, and with (c')y = cjf. 

n n (c) = ^Tr(ccW) - £ Kc u kf l + Tr(cc f c) + ^Tr(c 2 ) 

U 

j 

with (k) = AT 1 Yjik™ = AT 1 £,.jt° ut . We next repeat the cal- 
culation for the case of the triangle swap. For easier manipu- 



lations, we introduce a new matrix c 1 of double links, defined 
via (c^)ij = CijCjj. We then find 

«a(c) = ^ Sjjd jkSkiCjjC jk c ki c jjC k f ik 
ijk 

= ^{Tr(c 3 ) - 3Tr(cV) + 3Tr(c t2 c) + -Tr(c t3 )} 
= -Tr((c - c 1 ) 3 ) 



(10) 



In combination, Q and ( [T0| give us an explicit and exact 
formula for the graph mobility n(c) = n n (c) + n A (c), and 
hence via <|3j a fully exact MCMC process for generating ran- 
dom graphs with prescribed sequences and any desired prob- 
ability measure in the standard form Z 1 exp[-H(c)]. Since 
( 9|10| l cannot be written in terms of the degree sequence only, 
neglecting the mobility (as with accept-all edge swapping) 
would always introduce a bias into the sampling process. 



III. PROPERTIES AND IMPACT OF GRAPH MOBILITY 

A. Bounds on the mobility 

We will now derive bounds on the sizes of the mobility terms. 
This may show for which types of networks the application of 
'accept all' edge swapping (which ignores the mobility terms) 
is most dangerous, and for which networks the unwanted bias 
may be small. We first observe that 

n A (c) = r ^ c y (l - Cji)c jk {\ - c kj )c ki (l - c ik ) < -Tr(c 3 ) 

ijk 

Hence, the mobility n(c) = n n (c) + n A (c) obeys 

n(c) < ^Tr(ccW) - k^c u k° ut + Tr(cc f c) + ^Tr(c 2 ) 

U 

+ 1 -N 2 (kf-Y J k l ;k° M + 1 -Tv(c 3 ) (11) 

j 

We find upper bounds for most of the terms above by applying 
the simple inequality cyCa < |(cy + q/), which gives e.g. 

Tr(cc f c) < - [(k in 2 > + (k out 2 >] Tr(c 2 ) < N(k) (12) 



Tr(ccW)<2fci%A:? 



Tr(c 3 ) < N(k in k out ) (13) 



An upper bound on 2,y k™Cjjk° follows from the observation 
that if aj = 1 then certainly k™ > 1 and £° ut > 1. Hence 



= -N[{k m 2 > + (k oa 2 >] (14) 
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Combining ( jl2|13|14] i with (111 then gives 

n(c) < ^[N{k) 2 + <*> + l -[(k in 2 ) + (k oa 2 >] - ^(« out >l 

(15) 

Next we calculate a lower bound for «(c). We use simple iden- 
tities such as 

Tr(c 2 ) > n A (c) > Tr(cc f c) > (16) 

and 

Tr(cc t cc t ) > i ^ CjiCj k c ek cci(6j{ + fo) 
= AT[(fe™ 2 ) + <£ out 2 )] 

We now find 

n(c) > l -N[{k™ 2 > + (k° at 2 )] + l -N 2 (k) 2 - J] ft**" 



(17) 



(18) 



We finally need an upper bound for 2y k^c^k^K which we 
write in terms of fej?.„ = max,- A: m and fc° u ' = max,- k° ut : 

U U 

= ^[Cax<fc 0Ut2 > + O^ n2 >] d9) 

We thus obtain our lower bound for the mobility: 

n(c) > £ [ N (k} 2 + ((k [n -k°» 1 ) 2 } - C<r' 2 ) - Cax<^ in 2 >] 

(20) 



B. Identification of graph types most likely to be biased by 
'accept all' edge swapping 



We know from <(3j that unbiased sampling of graphs, i.e. 
p(c) = 1 /|fi| for all c e Q, requires using the following state- 
dependent acceptance probabilities in the edge swap process: 



A(c\c') = [1 +n(c)/n(c')]- 



(21) 



We now investigate under which conditions one will in large 
graphs effectively find n(c)/n(c') — > 1 for all c, c' e Q., so that 
the sampling bias would be immaterial. Let us define 

An = max \n(c) - n(c')\ = maxn(e) - min n(c) (22) 

CC'eSl Cen CeCl 



Using the two bounds ( 15|20| i we immediately obtain 
An < ^[(k) - \[{k™ 2 )+(k oat 2 >] + ^ in £ out > 

+Ca X <fc OUt2 > + Cax^ n2 >] 

= f [<*> - ^[<^ in 2 )+(k oul 2 >] - \((k m - k°«) 2 } 

+ C x <fe° Ut2 ) + Cax^ ln2 )] 



N 



[<fc) + Cax^ OUt2 > + Cax^ m2 )] 



(23) 



Clearly 1 - A«/n(c) < n(c')/n(c) < 1 + An/n(c), so in view of 
pTj ) we are interested in the ratio An/n{c), for which we find 

A«_ < <fc) + C ax <fc° Ut2 )+Ca t x<fc in2 ) (24) 

n(c) N (k) 2 - kZ x (k° M 2 ) - kZAk™ 2 ) 
So we can be confident that the impact of the graph mobility 



on the correct acceptance probabilities (21 1 is immaterial if 



^ + ^2 (Cax<£° ut 2 > + kZ,ik m 2 >) « N (25) 

We see from this that we can apply the 'accept all' edge-swap 
process with confidence when we are working with a large 
network with a narrow degree distribution. 



IV. MOBILITIES OF SIMPLE GRAPH EXAMPLES 

In this section we confirm the validity of the mobility formulae 
( 9|10| l for several simple examples of directed graphs. 



1 . Two isolated bonds: 

Here we have C\% = 1, C34 = 1, and Cy = for all (i, j) t 
{(1,2), (3,4)}. It is immediately clear that = 0, and 



2*^0*5* = 2, J] k T k 7 = °> {k) 



„2\ _ rp ,3 



N 



Tr(c £ ) = Tr(c J ) = Tr(cc T c) = 0, Tr(cc T cc T ) = 2 



Insertion into ( 9|10 l gives n n (c) = 1 and n A (c) = 0. As 
we would expect: only one (square) move is permitted. 

2. Isolated triangle: 

This example is defined by en - C23 = C31 = 1, with 
c u = for all (i,f) ( {(1,2), (2, 3), (3,1)}. Again we 
have = 0, but now 

U j 

Tr(c 2 ) = Tr(cc l c) = 0, Tr(c 3 ) = 3, Tr(cc t cc t ) = 3 



This results in n n (c) = and njc) = 1. The only 
possible move is reversal of the directed triangle. 

3. Complete (fully connected) graph: 



5 



Here Cy = 1 - <5,-/, and no edge swaps are possible. All 
nodes have k™ = k° m = N—\, and since - c we know 
that n A (c) = 0. This connectivity matrix, also featured 
in El, has eigenvalues A — N - 1 (multiplicity 1) and 
A = — 1 (multiplicity N — 1). Hence 

Tv(c 2 ) = YjAl =N(N-l), 

i 

7r(cc l c) = Tr(c 3 ) = ^/l 3 = N(N-l)(N-2), 

i 

Tr(ccW) = Tr(c 4 ) = = (iV-l)[(iV - l) 3 + 1] 

i 

Assembling the entire expression for the square mobil- 
ity term ([9]) indeed gives the correct value n n (c) = 0. 

4. Directed spaning ring: 

This directed graph, defined by c,j = <5, + i ;i modulo yV, 
gives a ring with a flow around it. We choose N > 2. 
Once more c 1 = 0, and we obtain for the relevant terms 

Tr(c 2 ) = Tr(c 3 ) = Tr(cc' c) = 0, Tr(cc 1 cc 1 ) = yV 

The final result, n a (c) = ^N(N - 3) and n A (c) = 0, is 
again what we would expect. As soon as one first bond 
to participate in an edge swap is picked (for which there 
are yV options), there are yV - 3 possibilities for the sec- 
ond (since the already picked bond and its neighbours 
are forbidden). The factor 2 corrects for over-counting. 

5. Bidirectional spanning ring: 

Our final example is the nondirected version of the pre- 
vious graph, viz. Cjj — fijj-i + £>,j + i modulo yV, with 
N > 2. Since = c we have n A (c) = 0. Now 

Yjkfcijkf 1 = 8N, Y,k° ut kJ=4N, (k) = 2 
U J 
Tr(c 2 ) = 2N, Tr(c 3 ) = Tr(cc f c) = 
Tr(ccW) = 6yV 

We thereby find n D (c) = 2N(N - 4). This is double 
the mobility evaluated in [16|, since every move in the 
undirected version of the network corresponds to two 
possible moves in the directed version of the network. 



V. A RANDOMISATION PROCESS WHICH PRESERVES 
DEGREES AND TARGETS DEGREE-DEGREE 
CORRELATIONS 

So far we applied formula <|5j for the canonical acceptance 
probabilities for directed graph edge swapping to the prob- 



lem of generating graphs with prescribed in- and out-degrees 
(k m , k out ) and a uniform measure. Here consider how to gen- 
erate graphs which, in addition, display certain degree corre- 
lations. We first rewrite |5]l as 



A(c\c') 



I + "( C ) C H(C)-H(C) 

n(c') 



-1 



(26) 



These probabilities (26i ensure the edge-swapping process 
evolves into the stationary state on Q. = {c e {0, 1}" | k m (c) = 
k m , k oa \c) = F ut ) defined by p M (c) = Z 1 exp[-#(c)]. The 
full degree-degree correlation structure of a directed graph c 
is captured by the joint degree distribution of connected nodes 



with k = (k ln ,k° ut ). The maximum entropy distribution on 
£2, viz. all directed graphs with prescribed in- and out-degree 



sequences, that has the distribution ( 27 1 imposed as a soft con 



straint, i.e. £cen p(c)W(k, k'\c) = W(k, k') for all (k, k'), is 

7-1 



p{c)=Zr^\\6 iiim (28) 

i 



(see 1201), in which Q(k,k') = W{k,k')/ p(k)p(k') and p(k) = 
p(k ln ,k° m ). It is now trivial, following [16|, to ensure that 
our MCMC process evolves to the measure (f28"]> by choosing 



H(c) = - log p(c) in the probabilities (26 1. This gives 
A(c\c') 



1 + 



n(c) pr f Q(kJjy ij + (l-f Qjkjjtyl-c'j) 



n(c') 11 



<A> 

A' 



Q(kJj) ) 



l 



(29) 



If the proposed move is a square edge swap, it is characterized 
by four distinct nodes (i, j, k, €), and takes us from a graph c' 
with c'ijC'^c'^c'tf = 1 to a new graph c with CifuCkfil = 1 
(leaving all other N 2 - 4 bond variables unaffected). For such 



moves the acceptance probabilities ( 29 1 become 



A D {C\C') = 



1 + 



Me) \(k)Q(k t ,kj) )\(k)Q(k„kc) ) 

n{c')( n A( n A 

\(k)Q(k„kj) )\(k)Q(kJ r ) ) 

For large yV we may choose to approximate this by 
n(c) Q(kiJj)Q(k,kt)" ] 



(30) 



A D (c\c') 



1 + 



(31) 



If the proposed move is a triangle edge swap, it is charac- 
terized by three distinct nodes (i,j,k), and takes us from a 
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graph c' with c'^c'^c'^c'-f'^c'^ — 1 to a new graph c with 

CijCjkCki c jiCk fik - 1 (leaving all other N 2 - 6 bond variables 
unaffected). Now the acceptance probabilities (|29]> become 



A A (c\c') 



1 + 



11(C) \(k)Q(k h ki) )\(k)Q(k k ,kj) J\(k)Q(k,M ) 



(32) 
l 



*Ml_^_{\l_ ! ^_ l \(_s^_{\ 

\{k)Q(,k,,kj) )\(k)Q(kj,k k ) j\(k)Q(k t ,k,) j 

For large N we may choose to approximate this by 



A A (c\c') = 



1 + 



n(c) Qikuk^QikjMQih^i) 



(33) 
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VI. NUMERICAL SIMULATIONS OF THE CANONICAL 
RANDOMIZATION PROCESS 

In this section we describe numerical simulations of our 
canonical MCMC graph randomization process and its 'ac- 
cept all' edge swapping counterpart, applied to synthetic net- 
works and to biological signalling networks. The most conve- 
nient marker of sampling bias in randomisation is the mobility 
n(c) itself, which we will therefore use as to monitor the dy- 
namics of the process. We used the Mersenne Twister random 
number generator from CT1 . For numerical implementation, 
we use expressions for the incremental change in the mobility 
terms following a single edge swap move (similar to how this 
was done for nondirected networks |fl6l ) - see appendix [A] 
This avoids having to calculate n(c) after each move, which 
would involve repeated matrix multiplications. Full source 
code (in C++) and Windows executables of our implementa- 
tion are available on request. 



A. Split flow network 




FIG. 3. The possible realisations of a split flow type network, with 
N = K + 2. The left hand configuration has a mobility of K(K - 1); 
there is only one such configuration. The right hand configuration 
has mobility of 2K - 3; there are K(K - 1) such configurations. 

A split flow network, see e.g. IPTl . is built as follows. Node 
; = 1 has degrees (k™, k° M ) = (0, K), we have K nodes (i - 
2...K+1) with degrees (/tj n , fc° ut ) = (1, 1), and a final node 
with degrees (k^ +2 , &™| 2 ) - (K, 0). There exist two types of 
graph with this specified degree sequence. The first is shown 



FIG. 4. Comparison for 'split flow' networks with K = 25 of ran- 
domization via 'accept all' edge swapping (squares) versus edge 
swapping with the canonical acceptance probabilities (crosses). The 
mobility (n(c)) is used as a dynamical observable, since its expecta- 
tion value is sensitive to sampling bias. Each marker gives the av- 
erage mobility over 10,000 iterations. Observed values are in good 
agreement with theoretical predictions: (n(c)) » 58.32 for 'accept 
all' edge swapping (predicted: 58.52), versus («(c)) » 47.95 for cor- 
rect edge swapping (predicted: 47.92). 



in the left of figure [3] The second type is obtained from the 
first by choosing two of the K 'inner nodes', of which one will 
cease to receive a link from i = 1 and the second will cease to 
provide a link to i = K + 2; so the mobility of the left graph 
is n(c) = K(K - 1). On the right-hand side configurations 
in figure [3] we can execute three possible square edge swap 
types: returning to the previous state (1 move), changing the 
internal node that is not receiving a link from i = 1 (K - 2 
moves), or changing the internal node that is not sending a 
link to i = K + 2 (K - 2 moves), giving a total mobility for 
the graphs on the right of «(c) = 2K - 3. The total number of 
such split flow networks is \Q\ = K(K - 1) + 1. 

Figure |4] shows graph randomisation dynamics for a split- 
flow network with K = 25, comparing 'accept all' edge 
swapping (which would sample graphs with the bias p(c) = 



n(c)/ iZc'en n ( c ')) to the canonical edge swap process (21 



that is predicted to give unbiased sampling of graphs p(c) = 
1/|Q|. The predicted expectation values of the mobilities in 
the two sampling protocols are 



'accept all' : (n(c)) = 



canonical: <«(c)> 



Zc^V) _ 5^ 2 - 13^+9 



PI 



2(^-1) 
2K(K-\) 2 
\+K{K-l) 



58.52 



47.92 



The simulation results confirm these quantitative predictions 
(see caption of figure [4] for details), and underline the sam- 
pling bias caused by 'accept all' edge swapping, as well as 
the lack of such a bias in our canonical MCMC process. 
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FIG. 5. The directed version of a 'nearly hardcore' network. Given 
the imposed degree sequences, there are only two types of graphs: 
the one shown here, and the one obtained by via an edge swap that 
involves the nodes of the isolated link and two nodes from the core. 



0- 
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FIG. 6. Illustration of the edge swap that transforms a nearly hard- 
core graph from state A to one of the type B states. 



B. Nearly hardcore networks 

'Nearly hardcore' networks are another example of graphs for 
which 'accept all' edge swap sampling are known to exhibit a 
significant bias [ 16 1. The directed version of such networks is 
constructed from a single isolated bond plus a complete sub- 
graph of size K — N - 2. See figure[5] Triangle swaps are not 
possible. From the graph shown in the figure (the 'mobile' 
state, A) there are K(K- 1) ways to choose two nodes of the 
core to combine with the two non-core nodes to form an edge 
swap quartet, hence this state has n A {c) = K(K— 1). After an 
edge swap the graph in figure [5] is replaced by one in which 
one non-core node receives a link from the core, and the other 
sends a link to the core; see figure [6] There are K(K- 1) such 
graphs, to be called type B, hence the total number of nearly 
hardcore graphs is |Q| = K(K-1)+1. From each type B graph 
the inverse swap can be applied, plus 2(K-2) further moves 
that each equate to replacement of one of the core nodes in- 
volved in the previous swap by another. Hence ns(c) = 2K—3. 
These statements are confirmed by formula ([9]). 

The predicted expectation values of the mobilities in the 
two sampling protocols, 'accept all' edge swapping (which 
would sample graphs with the bias p(c) = n(c)/ iZc'en n ( c ')) 
and the canonical edge swap process (21 1 (predicted to give 
unbiased sampling of graphs p(c) = 1/|Q|), are 



n 2 A (c)+K(K-l)n 2 B (c) 

'accept all': (n(c))= — - — = 

P n A (c)+K(K-l)n B (c) 

n A (c) + K(K-l)n B (c) 



canonical: (n(c)}- 



l+K(K-l) 



5K 2 -13K+9 
2(K-l) 
2K(K-\f 
' \+K{K-l) 



Figure [7] shows graph randomisation dynamics for a nearly 
hardcore network with K = 18 (so N = 20). Here the theory, 



40 



<«(c)> 



35 



30 



100000 200000 300000 400000 500000 

iterations 



FIG. 7. Comparison for 'nearly hardcore' networks with K = 18 of 
randomization via 'accept all' edge swapping (squares) versus edge 
swapping with the canonical acceptance probabilities (crosses). Each 
marker gives the average mobility over 10,000 iterations. Observed 
mobility values are again in good agreement with theoretical pre- 
dictions: (n(c)) ~ 41.09 for 'accept all' (predicted: 41.03), versus 
(n(c)) x 33.92 for correct edge swapping (predicted: 33.89). 



i.e. the previous two formulae, predicts that we should see 
(n(c)) « 41.03 for 'accept all' edge swapping, and (n(c)) « 
33.89 for unbiased sampling. Again the simulation results 
confirm our predictions (see caption of figure[7]for details). 



C. Application to gene regulation networks 

Gene regulation networks are important examples of directed 
biological networks. Figures [8] and [9] show numerical results 
of the randomization dynamics applied to the gene regula- 
tion network data of (22) (with N - 5654 nodes) and (2J1 
(with N = 3865 nodes), respectively. We apply all three ran- 
domization processes discussed so far in this paper, viz. 'ac- 
cept all' edge swapping, canonical edge swapping aimed at 
uniform sampling of all graphs with the degree sequences of 
the biological network, and canonical edge swapping aimed 
at uniform sampling of all graphs with the degree sequence 
(k\ , . . . , kff) and (on average) the degree-degree correlation 
kernel W(k, k') of the biological network. In contrast to the 
synthetic examples in the previous subsection, in gene regu- 
lation networks we do not observe significant divergence be- 
tween 'accept all' versus canonical edge swap randomization; 
this is similar to what was observed earlier for the randomiza- 
tion of protein-protein interaction networks in ifTBI . We also 
see that in both cases the biological network is significantly 
more mobile than the typical network with the same degree 
sequence. However, figure [8] suggests that the set of networks 
that share with the biological one both the degree sequence 
and the degree correlations (and hence resemble more closely 
the biological network under study) all have high mobilities. 

Implementating degree-degree correlation targeting di- 
rectly has the effect of severely reducing the space of graphs 
through which the process can pass, hence we would expect 
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D. Targeting degree-degree correlation 

In addition to being unbiased, the canonical MCMC process 
in this paper can sample according to any specified measure 
on the space of degree-constrained graphs. The particular ex- 
ample which we've developed is the generation of directed 
graphs from the tailored ensemble (|28] >, via the acceptance 
probabilities ( 30|32 1. Figures [8] and [9] show the trajectory of 
this process for two different datasets. Figure 10 is provided to 
illustrate that the network corresponding to this process suc- 
cessfully reproduces the key features of the assortativity of the 
real network. In particular, the characteristic downwards slope 
was postulated by IflOll to be a key feature of protein networks, 
associated with greater stability and improved specificity. 



FIG. 8. Randomization dynamics for the gene regulation network 
data of 1 22 1 . The observable shown is a rolling average of the nor- 
malized average square mobility (n n (c))/N 2 . We compare 'accept 
all' edge swapping (+), canonical edge swapping aimed at uniform 
sampling of all graphs with the biological degree sequence of the bi- 
ological network (□), and canonical edge swapping aimed at uniform 
sampling of all graphs with the degree sequence (£,,..., k N ) and the 
degree-degree correlation kernel W(k, k') of the biological network 
(A). Hamming distances between the start and end networks of □, + 
and A were 0.8, 0.8 and 0.75 respectively. 



3.55 



("(C)) 



3.5 



3.45 

500000 1000000 1500000 2000000 

iterations 

FIG. 9. Randomization dynamics for the gene regulation network of 
1231 . The key and the axes are the same as in figure [8] Hamming 
distances between the start and end networks of □, + and A were 
0.94, 0.94 and 0.86 respectively. 



finite-size effects to be more pronounced. The process would 
be less restricted, and hence more natural, with a smoothed 
target degree-degree correlation. There is a trade-off between 
the flexibility of the process and the accuracy of the targeting. 
We have used a light Gaussian smoothing, generalising what 
was used in 11241 to the higher dimension we need. The best 
choice target degree-degree correlations - including decisions 
about smoothing - will very much depend on the particular 
problem being studied. 



VII. CONCLUSION 

In this paper we have built on the work of J3] and 1T61 to define 
an ergodic and unbiased stochastic process for randomising 
directed binary non-self-interacting networks, which keeps 
the number of in- and out- connections of each node con- 
stant. The result takes the form of a canonical Markov Chain 
Monte Carlo (MCMC) algorithm based on simple direct edge 
swaps and triangle reversals, with nontrivial move acceptance 
probabilities that are calculated from the current state of the 
network only. The acceptance probabilities correct for the en- 
tropic bias in 'accept all' edge-swap randomization, which is 
caused by the state dependence of the number of moves that 
can be executed (the 'mobility' of a graph). 

Our process is precise for any network size and network 
topology, and sufficiently versatile to allow random directed 
graphs with the correct in- and out-degree sequence to be gen- 
erated with arbitrary desired sampling probabilities. The al- 
gorithm can be used e.g. to generate truly unbiased random 
directed graphs with imposed degrees for hypothesis testing 
(in contrast to the 'edge stub' algorithm or the 'accept all' 
edge swap algorithm, both of which are biased), or to gener- 
ate more sophisticated null models which inherit from a real 
network both the degree sequence and the degree correlations, 
but are otherwise random and unbiased. 

Our core insight is similar to |25| and J3). However, our 
work takes the formalism further, and generates a direct ad- 
justment to the MCMC based on the current state of the net- 
work only, rather than a retrospective adjustment to the ob- 
served process [25 1 or a search of the entire state-space Q. 
Moreover, our approach can be generalised to generate more 
tailored null-models (e.g. our example of targeting a specified 
degree-degree correlation). 

We have derived bounds to predict for which degree se- 
quences the differences between 'accept all' and correct ran- 
domization (i.e. the effects of sampling bias) are negligible. 
Application to synthetic networks showed a large discrepancy 
between the 'accept all' and correct randomization processes, 
and good agreement with our theoretical predictions for the 
values of key observables that are affected by the entropic bias 
of incorrect randomization. For the biological networks which 
we studied (gene regulation networks) we find the differences 
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FIG. 10. These charts summarize the degree-degree correlations 
observed in the original network (middle row o), the final network 
after the process targeting the flat measure (top row □) and the pro- 
cess tailored to preserve the degree-degree correlation pattern of the 
original network (bottom row A). The data used is based on 1231 
and the process shown in figure [9] The left hand charts summarize 
the correlation between the in-degree of a node and the average out- 
degree <^nJ|')in of its in-neighbours. The right hand charts summarize 
the correlation between the out-degree of a node and the average in- 
degree (&jS,)out of its out-neighbours. This representation was chosen 
as a widely adopted and easy to interpret measure of the assortativity 
of a directed network. 



between correct and incorrect sampling in the space of graphs 
with imposed degree sequences to be negligible. However, 
this cannot be relied upon to continue in future studies, espe- 
cially when network datasets become less sparse, or random- 
ization processes which target more complicated topological 
observables are used. 

Biological signalling networks tend to have 'fat-tailed' dis- 
tributions with low average degree and relatively high cluster- 
ing levels, whereas in a graph ensemble defined by prescrib- 



ing in- and out- degree sequences and uniform graph prob- 
abilities, graphs will typically have 0(1) triangles per node 
or less. Hence, if we run edge swap processes on such en- 
sembles, by the time equilibration is approached the algorithm 
will typically be moving through networks with low cluster- 
ing, where the change in mobility coming from those terms 
that 'count' triangles will be very low. However, this will be 
different if we target a non-flat measure, for instance if we 
generate graphs with degree-degree correlations. Since bio- 
logical degree-degree correlations seem to be associated with 
clustering, it will become increasingly dangerous to assume 
that the sampling bias caused by using 'accept all' edge swap 
dynamics will be modest. 

Given that precise and practical alternatives are now avail- 
able, we feel that there is no justification for the use of biased 
graph randomization processes. In those cases where we seek 
to generate unbiased random directed graphs with in- and out- 
degrees identical to some observed network, our canonical 
MCMC process would take the observed graph as its seed and 
take care of the required unbiased sampling. In those cases 
where we specify degree sequences ab initio, without having 
a seed graph, one may use the Molloy-Reed algorithm to gen- 
erate a (biased) seed graph prior to running our algorithm. 

In addition to being rigorously free of entropic sampling 
bias, our present canonical MCMC process is also able to gen- 
erate directed degree-constrained networks with any arbitrary 
specified sampling probabilities. We have shown examples 
of the generation of synthetic graphs generated with precisely 
controlled expectation values for the degree-degree correla- 
tion kernels, where the imposed sampling measure is a maxi- 
mum entropy distribution on the set of graphs with prescribed 
degrees, with degree correlations imposed as a soft constraint. 
Degree correlation is a promising candidate to define a better 
null model, as it has been observed in the literature to act as 
a 'signature' distinguishing different types of networks (e.g. 

Two directions for future research could be to look at 
weighted networks (e.g. to integrate our ideas with those in 
papers such as (28]), or at bipartite networks (which also have 
interesting applications, see e.g. l29l ). Furthermore, it would 
seem appropriate in the field of network hypothesis testing to 
take more seriously the nontrivial number of short loops in bi- 
ological signalling systems. Whenever we randomize within 
the large amorphous space of graphs that inherit from the bi- 
ological network only the degree sequence, we are effectively 
running a dynamics on graphs that are locally tree-like, where 
(conveniently) the mobility issues are minor. But we know 
already that this large set will typically produce null models 
that are very much unlike biological networks, for that same 
reason. How informative are small p-values in this context? 
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Appendix A: Efficient calculation of changes in mobility terms 
following one move 

Calculating the mobility n(c) terms is computationally heavy. 
Given that our moves are simple and standard, we follow the 
alternative route in |[T6l and derive formulae for calculating 
the change in mobility due to one move, so that we can avoid 
repeated heavy matrix multiplications at each time step. 



1. Change in n n (c) following one square-type move 

Without loss of generality, define our square move to be the 
transformation between matrix c and x, involving four nodes 
(a, b, c, d), such that for all (z, j): Xy - cy + Ay, with 

Ay = 6 ia 6 jd + 6 ic 6 jb - 6 ia 6 jt - S ic 6jd 

We now determine the overall change induced in n n (c) by 
finding the impact of an edge swap on each term in ([9j. on 
the right hand side of the expression above. 
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• Term 1: 

Tr(;rx t xjt t ) - Tr(cc t cc t ) = ^ [c, 7 c i; Q m c im 
-(Cy + Aij)(c kj + A kj )(c km + A km )(c 

— AijC k jC km Ci m + ... + A[jA k jC km Ci m + ... 

+ AjjA k jA km c im + ... + AjjA k jA km A im 

where ... refers in each case to three similar terms (with 
their appropriate indices). Let us inspect what happens 
when two A terms are multiplied together. We might 
have the first suffix repeated, the second suffix repeated, 
or no repeated sufficies: 

AjjA im - 2[Sjd(6 mc i - 5 m b) + 5jb(5 m b - <W)1 

A u A kj = 2[5 ia (5 ka - 6 kc ) + 6 ic (6 kc - 6 ka )] (Al) 

One immediately observes that 



AijA k jA km Ai, 

ijkm 



= 4^ [6i a 6 ia (6 ka 6 ka + 6 kc 6 kc ) + 8 ic 5 ic (5 kc 5 kc + 6 ka 6 ka )] 

ik 

= 16 

To handle two A terms with different sufficies we use 

AijC k j = c kb (S ic ~ d ia ) + cm (Sia - S ic ) (A2) 
which leads us to 



^ ^ A/ jC k jA km Ci m — 4 

ijkm 



Returning to the result Al it follows that 



AijA k jC im c km - 2{5 ia {5 ka - 8 kc ) + 6 ic (d kc - 5 ka ))c im c km 

and the symmetric term gives 

AijAi m c k jC km = 2(k d n + kf) - AcidCn, 
For the third order terms we combine \A\\ and \A2\ : 

/ , AjjA im A kj c km = 2 2^ [ [Si a (6 ka -5 k c) + Sic(6 kc -6 ka )] 

ijkm ik 

x [6 ia c kd + SicC k b - 6 ia c kb - 6 ic c kd ] J 

= 2(c ad - C c d - C ab + C c b - Cab + C c b + C ad - C c d) 



By permutation of sufficies all such terms evaluate 
to -8. Finally we turn to the four terms where 
only one A appears, corresponding to permutations of 



A'ijC k jC km Ci m — C kd C km C am + C kb C km C C m C kb C km C am 

CkdCkmCcm- Adding up all separate elements above, we 
obtain the change in the square mobility term due to one 
application of a square move: 

A^Tr^c'cc 1 ")] = 2(^ n + + k° m + k° ul ) 

{Ckd^km^am ^kb^km^-cm ~ ^kb^km^-am ~ ^kd^km^cm) 

-4 (cidCib + c am c cm + 1) (A3) 



Term 2: 

y u 

= k T l jd + Si C 5jb - 5 ia 5 jb - died jd\ kj 



,out,in h°ut,ia _ tout tin _ uoatim 
"a K d K c K b K a K b K c K d 



(A4) 



Term 3: 



Tr(xx^x) - Tr(cc f c) 
= 2j [ c 'i + A 'j] [ Ck i + Ak -i] f Qi + Aki ^ ~ c U c Ki c ki 



ijk 



= [ A kjCijC ki + AijC kj c ki + A ki c u c k j + AtjAkjCu 

ijk 

+ A k jA ki Cij + AijA ki c kj + AyAjyAa] 
The product of two A terms gives 
AyAjy = 6 ik \6 jd (6 ia - S ic ) + 5j b (6ic - (5 ia )J 

-5jd(6ia6 kc + SicSka) ~ Sjb{S ic 5 ka + diad kc ) 

but Ay-Ay = 0, and in a straightforward way we obtain 

^ A k jCijC ki = 2^ \s ka 8jd + (Jfc^i - 6 ka 8jb - 6 kc 5jd\ CijC ki 

ijk ijk 

— ^ ^ ^a/Qd "t" CciCib ~ C a iCib C c /C/jJ 
i 

Assembling all terms and their symmetric equivalents 
leads to an expression which can be summarised as 

A[Tr(cc f c)] = MutN(a, d) + MutN(c, b) - MutN(a, b) 
-MutN(c, d) - 2(c bd + c d b + c ac + c ca ) 

(A5) 

where 

MutN(a,jS) = ^ [caiCi/j + c ai Cfji + CfoC^g] (A6) 
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• Term 5: 



It follows that 



A [Tr(c 2 )] = Tr(* 2 ) - Tr(c 2 ) 
= 2_j [ c, 7 + (diadjd + 5i C 8 jb - 5i CI 5 jb - 5 ic <5 jd )] 

'j 

x [cji + (6 ja 6id + 6 jc 6 ih - 5j a 5 ib - 5j C 5 id )] - Tr(c 2 ) 

= 2(Cda + Cbc - C ba ~ C dc ) 



2 ) 

(A7) 



Terms 4 and 6: 

The two terms jN 2 (k) 2 and Jlifc™ 1 ^" 1 do not change, 
since our stochastic process conserves all degrees. 



In combination, the above ingredients lead us to the following 
update formula for the square mobility (|9j, as a result of the 
edge swap (|AT]>: 

An n = 2(4" + ftj° + C' + C) 

(^Cj i( fC] im C am + CkbCkmCcm C k b C kmC am CkdCkmCcm) 
4 {C[dC[b + C am C cm + 1) 

+MutN(a, d) + MutN(c, b) - MutN(a, b) 
-MutN(c, d) - 2(c bd + c db + c ac + c ca ) 

+Cda + Cbc - C ba ~ Cdc (A8) 



2. Change in n A (c) following one square-type move 



The different terms in the triangle mobility term (to be called 
Term 7, Term 8, Term 9 and Term 10, to avoid confusion with 
the previous section) are 

n(c) A =5Tr(c 3 ) - Tr^c 2 ) + Tr(c l2 c) - ±Tr(cI 3 ) 



Term 7: 



ATr(c 3 ) = Yj [*ijx. 



jk %ki CijCjk Cfci 



— 3 ^ ^ (c la Cd\ + Cj c Cbi Ci a Cbi C lc C d i) 



• Term 8: 

Here we have to inspect first how the matrix of dou- 
ble bonds is affected by a square move: 



ATr(c J c 2 ) = Tr(xV) - Tr(c : c 2 ) 

= Z (4 + 4 + 4) ( c * + <<* + A ^ - Tr < cI 



ijk 



Arguments similar to those employed before show that 
Yjj AyAji = Yji -Aju = 0, whereas the remaining two 
'compound' terms give 

Yj 4 A ;* Cjfci = 

ijk 

Z { S ja<5idC d a+6jc5ibCbc-<5j a <5ibCb a -Sjc6 id Cdc) 
Uk 

X \5j a Skd + Sj c 6kb ~ Sjadkb ~ 5j c 5kd) Cki 

= (c da + c dc )6id(6kd - 5kb)Cki 

+ (Cbc + Cba)Sib{Skb ~ 8kd)Cki 
= ~{c da + C dc )Cbd - (Cbc + C ba )c d b 



and 

2 ^ifjk^ki = 

ijk 

Yj { 6 ja5idC da + 5jc&ibCbc ~ Sj a S ib C ha - 6j c 5 id C d c) 
ijk 

X (Ska8i d + dkcdib - 6ka5ib - 5k c 6 id ) Cj k 

= (c da +C ba )6j a (_5ka-5kc)Cjk 

+ {Cbc + C dc )6 j c {$kc-5ka)Cjk 
= ~ [(C da + C ha )c ac + (C hc + C dc )c ca ] 

The product of three Deltas can be immediately seen to 
be zero by earlier arguments (repeated suffix in different 
positions). The other terms evaluate as follows: 



Ya* 



-IfjkCki = 
ijk 0e{a,c}, ae{d,b) 

±kic]jCjk = 
ijk /3e(o,c), ae{d,b) 



2^ l(a,p)c ak c a/j c k/3 (A9) 



2 A, 



Y J /s -jkc)fki= 2j l(a,P)c ak c 

ijk /3e(fl,c), ae{d,b) 



with 



where I(a,/3) is an indicator function which evaluates 
to 1 if bond (a, ft) is created by the present move, to 
-1 if the bond (a, ft) is destroyed, and zero otherwise. 
Similarly 

•jk ijk 

X (<5 jadidCda + 5 j c 5ibCbc ~ $ ja&ibCba ~ 5 j c 5i d C dc ^ 
= X ( c akCk d C da +C c kCkbCbc-CakCkbCba-CckC kd C dc ) 



A.. = 5 ia 8j d c d a + 5 ic 5jbCbc - 8 ia 5jbCb a - 6 ic 5j d c d c 
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Putting all of these sub-terms together yields: 
A [TrCcV)] = Y I{a,P) 

Pe{a,c}, ae{d,b] 

k 

-Cbd (Cda + Cdc) + C ac (Cda + Cba) 

+ Cdb (Cbc + C ba ) + C ca (c bc + C dc ) (A 10) 

• Terms 9 and 10: 

The same steps as followed to calculate term 8 can be 
also be applied to terms 9 and 10, In combination, the 
above ingredients lead us to the following update for- 
mula for the triangle mobility ( [T0| , as a result of the 
edge swap ( |A1[ >: 

An A = 2^ Ma,P) 2^ [ c akCkp - c a p{c ak c k fi + cp k c ka ) 

jS6{1,3}, cke{4,2} A- 

~^tk C *P ~ CakC lj3 + C ak C lp ~ ° a P Vak C \p + C L C jL) 

+c ap {c X ak c kp + c ak c\ p + c\ a c pk + CfoC^)] 

-CbdiCdb - 1) (Qfl(l - C/, fl ) + Cd c (l - c ic )) 

-c flc (c cfl - 1) (c da (l - c de ) + c 6fl (l - c bc j) 

-CdbiCbd ~ 1) (c/,e(l - C dc ) + C ba (\ - C da )) 

-C ca (Cac ~ 1) - C fea ) + Q c (l - C da )) (Al 1) 



3. Change in n D (C) following one triangle-type move 

The triangle move is a transformation from network c to net- 
work x, characterised by Xij = Cy + £2y with 

Qy = 5i b 5j a +5i C 5jb+6ia6j C -5j a 5jt-5 ib 5j C -5 ic 5j a (All) 

The terms which make up the square mobility term are 

n n (c) = ^Tr(ccW) - ^ fc^cyifcj? + Tr(cc f c) 

y 

+ ^W + iTr(c 2 )-^C , C 

I 

• Term 2: 

0' avSe{l,2,3} 

• Term 3: 

A[Tr(cc + c)] = Tr(xx f x) - Tr(cc 1 c) 

ijk 



We consider each subterm separately: 

^ OyCjyCfc = ^ Q W %Cy = 

y* y* 

^&kjCijCki= ^iV'fr/^CaiCip 

ijk a <( S6(l,2,3) i 

Clearly 

V QyOj ; - = ^ Qki^kj - 

v* y* 

since the Q. kills any suffix repeated in the same posi- 
tion. Furthermore, 

hence 

Q-ijQ ki c k j = 3 

ijk 

So it follows that 
Tr(xx f x) - Tr(cc l c) = 3 + J] I(ar,j8) ^ 

a,/3e(a,b,c] i 

• Term 5: 

We observe that £y c/fQy = 3 and £y QijQji = _ 6. We 
conclude that A[Tr(c 2 )] = 0. This is as expected, since 
double bonds cannot participate in a triangle swap. 

• Term 1: 

Finally we return to Term 1 using the various shortcuts 
derived above. We recall that a suffix repeated in the 
same position sends the term to zero. Hence, we already 
know that all terms featuring the product of 3 or 4 Q. 
terms will be zero. Next: 

j afie\ 1,2,3} 

From this it follows that 

^ i ^ijCkj^kmCjm — 

y 

Finally, 

^ ] ^ijCkjCkmCim — C km \.C k \ (C2m ^-3m) 
ijkm 

+Q2 (C3m - Clm) + Q 3 (C lm - C 2m )] 

(and similarly with the other terms related to this one 
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by simple permutations). Overall we thus find 



4. Change in n A (c) following one triangle-type move 



ATrCccW) = 4 J] c hn J] I(a,j8)c Q 



Collecting all these terms together, we see that the expected 
change in the square mobility term after the application of a 
single triangle type move is 

A [«□] = Ma,fi)[c ai c i/3 +k° ut k^+2 c km c am c k/3 ] + 3 

afie\a,b,c\ km 

(A13) 



This final incremental term is best evaluated by an algo- 
rithm which, for each edge created or destroyed, searches for 
mono-directed triangles that have been created or destroyed. 



